Probabilistic Neural Network Based English-Arabic Sentence Alignment
نویسندگان
چکیده
In this paper, we present a new approach to align sentences in bilingual parallel corpora based on a probabilistic neural network (P-NNT) classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually aligned training data was used to train the probabilistic neural network. Another set of data was used for testing. Using the probabilistic neural network approach, an error reduction of 27% was achieved over the length based approach when applied on English-Arabic parallel documents.
منابع مشابه
Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model
This paper describes a neural-network model which performed competitively (top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS) task. Our system employs an attention-based recurrent neural network model that optimizes the sentence similarity. In this paper, we describe our participation in the multilingual STS task which measures similarity across English, Spanish, and Ara...
متن کاملNeural Network-based Word Alignment through Score Aggregation
We present a simple neural network for word alignment that builds source and target word window representations to compute alignment scores for sentence pairs. To enable unsupervised training, we use an aggregation operation that summarizes the alignment scores for a given target word. A soft-margin objective increases scores for true target words while decreasing scores for target words that a...
متن کاملProbabilistic Graph-based Dependency Parsing with Convolutional Neural Network
This paper presents neural probabilistic parsing models which explore up to thirdorder graph-based parsing with maximum likelihood training criteria. Two neural network extensions are exploited for performance improvement. Firstly, a convolutional layer that absorbs the influences of all words in a sentence is used so that sentence-level information can be effectively captured. Secondly, a line...
متن کاملExtending a probabilistic phrase alignment approach for SMT
Phrase alignment is a crucial step in phrase-based statistical machine translation. We explore a way of improving phrase alignment by adding syntactic information in the form of chunks as soft constraints guided by an in-depth and detailed analysis on a hand-aligned data set. We extend a probabilistic phrase alignment model that extracts phrase pairs by optimizing phrase pair boundaries over th...
متن کاملEnglish Sentence Recognition using Artificial Neural Network through Mouse-based Gestures
Handwriting is one of the most important means of daily communication. Although the problem of handwriting recognition has been considered for more than 60 years there are still many open issues, especially in the task of unconstrained handwritten sentence recognition. This paper focuses on the automatic system that recognizes continuous English sentence through a mouse-based gestures in real-t...
متن کامل